Enhanced Spatial Stream of Two-Stream Network Using Optical Flow for Human Action Recognition

Abstract

Introduction: Convolutional neural networks (CNNs) have maintained their dominance in deep learning methods for human action recognition (HAR) and other computer vision tasks. However, the need for a large amount of training data always restricts the performance of CNNs. Method: This paper is inspired by the two-stream network, where a CNN is deployed to train the network using the spatial and temporal aspects of an activity, thus exploiting the strengths of both to achieve better accuracy. Contributions: Our contribution is twofold. First, we deploy an enhanced spatial stream, and it is demonstrated that models pre-trained on a larger dataset, when used in the spatial stream, yield good performance instead of training the entire model from scratch. Second, a dataset augmentation technique is presented to minimize overfitting of CNNs, where we increase the dataset size by performing various transformations on the images, such as rotation and flipping. Results: UCF101 is a standard benchmark dataset of action videos, and our architecture has been trained and validated on it. Compared with other two-stream networks, our results outperformed them in terms of ...
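The augmentation step described in the abstract (enlarging the dataset with rotated and flipped copies of each image) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `augment_frame` and `augment_dataset` are hypothetical, and the choice of 90-degree rotation steps plus horizontal and vertical flips is an assumption, since the abstract does not state which angles or flip axes were used.

```python
import numpy as np


def augment_frame(frame):
    """Return the original frame plus rotated and flipped copies.

    `frame` is an array of shape (height, width, channels).
    Assumption: rotations are restricted to multiples of 90 degrees.
    """
    variants = [frame]
    # Rotate by 90, 180 and 270 degrees in the image plane.
    for quarter_turns in (1, 2, 3):
        variants.append(np.rot90(frame, k=quarter_turns, axes=(0, 1)))
    # Mirror left-right (horizontal flip) and top-bottom (vertical flip).
    variants.append(np.flip(frame, axis=1))
    variants.append(np.flip(frame, axis=0))
    return variants


def augment_dataset(frames, labels):
    """Expand a labelled set of frames; each copy keeps its source label."""
    out_frames, out_labels = [], []
    for frame, label in zip(frames, labels):
        for variant in augment_frame(frame):
            out_frames.append(variant)
            out_labels.append(label)
    return out_frames, out_labels
```

Each input frame yields six samples under these assumptions. For non-square frames, the 90- and 270-degree rotations swap height and width, so the copies would need resizing or cropping before being batched for a CNN.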

Similar resources

Encoding Multi-resolution Two-Stream CNNs for Action Recognition

Two-Stream Convolutional Networks for Action Recognition in Videos

We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to generalise the best performing hand-crafted features within a data-driven learning framework. Our contribution is three-fold. First, we ...

Hidden Two-Stream Convolutional Networks for Action Recognition

Analyzing videos of human actions involves understanding the temporal relationships among video frames. CNNs are the current state-of-the-art methods for action recognition in videos. However, the CNN architectures currently being used have difficulty in capturing these relationships. State-of-the-art action recognition approaches rely on traditional local optical flow estimation methods to pre...

Two-Stream SR-CNNs for Action Recognition in Videos

Human action is a high-level concept in computer vision research and understanding it may benefit from different semantics, such as human pose, interacting objects, and scene context. In this paper, we explicitly exploit semantic cues with the aid of existing human/object detectors for action recognition in videos, and thoroughly study their effect on the recognition performance for different types...

Two-Stream convolutional nets for action recognition in untrimmed video

We extend the two-stream convolutional net architecture developed by Simonyan for action recognition in untrimmed video clips. The main challenges of this project are first replicating the results of Simonyan et al., and then extending the pipeline to apply it to much longer video clips in which no actions of interest are taking place most of the time. We explore aspects of the performance of th...

Journal

Journal title: Applied Sciences

Year: 2023

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app13148003